# Download ZHVI data
zhvi <- read.csv("https://github.com/opengeos/datasets/releases/download/us/zillow_home_value_index_by_county.csv")
Comparing Linear, Ridge, and Lasso Regression Models
Zhanchao Yang
This document demonstrates the application of Population Dynamics Foundation Model (PDFM) embeddings to predict Zillow Home Value Index (ZHVI) data. PDFM embeddings are 330-dimensional vector representations that capture complex spatial and demographic patterns.
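Ridge and Lasso regression are particularly useful when working with high-dimensional embeddings (330 features) because they regularize the coefficients: Ridge shrinks them toward zero, which stabilizes the fit when features are correlated or outnumber the observations, and Lasso can set some of them exactly to zero, which acts as feature selection.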
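The difference between the two penalties can be sketched in base R on simulated data. This is an illustrative sketch only, not the modeling code used in this document: the data are random stand-ins for the 330 embedding features, the helper names `ridge_fit` and `lasso_cd` and the penalty values are assumptions, and a dedicated package would normally handle the fitting and the choice of penalty strength.

```r
set.seed(42)
n <- 200; p <- 330                       # p mirrors the 330 embedding dimensions
X <- scale(matrix(rnorm(n * p), n, p))   # standardized stand-in features
beta_true <- c(rep(3, 5), rep(0, p - 5)) # only 5 features carry signal
y <- as.numeric(X %*% beta_true + rnorm(n))
y <- y - mean(y)                         # center the response (no intercept below)

# Ridge: closed-form solution of min ||y - Xb||^2 + lambda * ||b||^2.
# The lambda * I term keeps the system solvable even though p > n.
ridge_fit <- function(X, y, lambda) {
  as.numeric(solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y)))
}

# Lasso: coordinate descent on (1 / 2n) * ||y - Xb||^2 + lambda * ||b||_1.
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)
lasso_cd <- function(X, y, lambda, n_iter = 50) {
  n <- nrow(X)
  beta <- numeric(ncol(X))
  col_ss <- colSums(X^2) / n
  r <- y                                 # residual for beta = 0
  for (it in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      r <- r + X[, j] * beta[j]          # remove feature j's current contribution
      rho <- sum(X[, j] * r) / n
      beta[j] <- soft_threshold(rho, lambda) / col_ss[j]
      r <- r - X[, j] * beta[j]          # add the updated contribution back
    }
  }
  beta
}

b_ridge_small <- ridge_fit(X, y, lambda = 1)
b_ridge_large <- ridge_fit(X, y, lambda = 100)
b_lasso <- lasso_cd(X, y, lambda = 0.5)

cat("Ridge coefficient norm, lambda = 1:  ", sqrt(sum(b_ridge_small^2)), "\n")
cat("Ridge coefficient norm, lambda = 100:", sqrt(sum(b_ridge_large^2)), "\n")
cat("Lasso coefficients set to zero:      ", sum(b_lasso == 0), "of", p, "\n")
```

Ridge keeps every feature but shrinks the coefficient vector as the penalty grows, while Lasso drops features outright, which is why the two are worth comparing on 330-dimensional inputs.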
# Load packages for the pipe, mutate(), and str_pad()
library(dplyr)
library(stringr)
# Construct state and municipal FIPS codes with their leading zeros restored
zhvi_df <- zhvi %>%
  mutate(
    StateCodeFIP = str_pad(as.character(StateCodeFIPS), width = 2, side = "left", pad = "0"),
    MunicipalCodeFIP = str_pad(as.character(MunicipalCodeFIPS), width = 3, side = "left", pad = "0")
  )
# Create place identifier
zhvi_df <- zhvi_df %>%
  mutate(
    place = paste0("geoId/", StateCodeFIP, MunicipalCodeFIP)
  )

Note: The place column is a unique identifier for each county, built by combining the state and municipal FIPS codes. It is used to join with the geospatial and embedding data.
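The padding and join logic can be checked on a few rows before running it on the full table. The sketch below is a base R stand-in under stated assumptions: `sprintf()` is used here as an equivalent to the `str_pad()` calls, the embedding table `emb`, its column `feature0`, and its values are made up for illustration, and only the FIPS codes themselves are real.

```r
# Toy rows as they might look after read.csv(), which reads FIPS codes
# as integers and therefore drops their leading zeros
toy <- data.frame(
  RegionName = c("Los Angeles County", "Harris County", "Autauga County"),
  StateCodeFIPS = c(6L, 48L, 1L),
  MunicipalCodeFIPS = c(37L, 201L, 1L)
)

# Pad to 2 + 3 digits and prefix with "geoId/" in one step
toy$place <- sprintf("geoId/%02d%03d", toy$StateCodeFIPS, toy$MunicipalCodeFIPS)

# Hypothetical embedding table keyed on the same identifier (values are invented)
emb <- data.frame(
  place = c("geoId/06037", "geoId/48201", "geoId/01001"),
  feature0 = c(0.12, -0.40, 0.33)
)

# Inner join on the shared key; unmatched rows would be dropped
joined <- merge(toy, emb, by = "place")
print(joined[, c("place", "RegionName", "feature0")])
```

If the number of rows in the joined table is smaller than in the input, some identifiers failed to match, and missing leading zeros are the usual first thing to check.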
# Create interactive map with Leaflet
# (viz_gdf is the county geometry joined with ZHVI values; target_date is the
# name of the ZHVI column to display)
library(leaflet)
# Continuous color scale over the selected ZHVI column
pal <- colorNumeric(
  palette = "Blues",
  domain = viz_gdf[[target_date]],
  na.color = "transparent"
)
leaflet(viz_gdf) %>%
  addTiles() %>%
  addPolygons(
    fillColor = ~pal(get(target_date)),
    fillOpacity = 0.7,
    color = "white",
    weight = 1,
    popup = ~paste0(
      "<strong>", RegionName, ", ", State, "</strong><br>",
      "Home Value: $", format(get(target_date), big.mark = ",")
    )
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~get(target_date),
    title = "Zillow Home Median Value",
    opacity = 1
  )
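The popup text in the map above is built with `format()`, which works on the whole column at once: it pads entries to a common width and keeps decimals when the values have them. A possible refinement, shown here as a base R sketch on a toy data frame, is to round the values and build the label strings before mapping. The column name stored in `target_date`, the region names, and the values are placeholders, not taken from the ZHVI file.

```r
target_date <- "2024-10-31"   # hypothetical column name
toy <- data.frame(
  RegionName = c("A County", "B County"),
  State = c("CA", "TX")
)
toy[[target_date]] <- c(850000.2, 1210345.7)

# Pull the column by name once instead of calling get() inside each formula
values <- toy[[target_date]]

# Round to whole dollars, disable scientific notation, and trim padding
label <- format(round(values), big.mark = ",", scientific = FALSE, trim = TRUE)

popup <- paste0(
  "<strong>", toy$RegionName, ", ", toy$State, "</strong><br>",
  "Home Value: $", label
)
print(popup)
```

The resulting character vector could be stored as a column and passed to `addPolygons()` as `popup = ~popup_text`, assuming a column with that name is added to the mapped data.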